https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE119290
if (!requireNamespace("knitr", quietly = TRUE))
install.packages("knitr")
if (!requireNamespace("BiocManager", quietly = TRUE))
install.packages("BiocManager")
if (!("DESeq2" %in% installed.packages())) {
BiocManager::install("DESeq2", update = FALSE)
}
if (!("EnhancedVolcano" %in% installed.packages())) {
BiocManager::install("EnhancedVolcano", update = FALSE)
}
if (!("apeglm" %in% installed.packages())) {
BiocManager::install("apeglm", update = FALSE)
}
if (!("pheatmap" %in% installed.packages())) {
BiocManager::install("pheatmap", update = FALSE)
}
if (!("gprofiler2" %in% installed.packages())) {
BiocManager::install("gprofiler2", update = FALSE)
}
if (!("clusterProfiler" %in% installed.packages())) {
BiocManager::install("clusterProfiler", update = FALSE)
}
if (!("M3C" %in% installed.packages())) {
BiocManager::install("M3C", update = FALSE)
}
library(M3C)
library(umap)
Attaching package: ‘umap’
The following object is masked from ‘package:M3C’:
umap
library(Rtsne)
library(clusterProfiler)
Registered S3 method overwritten by 'data.table':
method from
print.data.table
clusterProfiler v4.0.5 For help: https://yulab-smu.top/biomedical-knowledge-mining-book/
If you use clusterProfiler in published research, please cite:
T Wu, E Hu, S Xu, M Chen, P Guo, Z Dai, T Feng, L Zhou, W Tang, L Zhan, X Fu, S Liu, X Bo, and G Yu. clusterProfiler 4.0: A universal enrichment tool for interpreting omics data. The Innovation. 2021, 2(3):100141. doi: 10.1016/j.xinn.2021.100141
Attaching package: ‘clusterProfiler’
The following object is masked from ‘package:stats’:
filter
library(gprofiler2)
Registered S3 method overwritten by 'htmlwidgets':
method from
print.htmlwidget tools:rstudio
library ("pheatmap")
library ("RColorBrewer")
library(magrittr)
library(matrixStats)
library(ggplot2)
library(DESeq2)
Loading required package: S4Vectors
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:parallel’:
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ, clusterExport,
clusterMap, parApply, parCapply, parLapply, parLapplyLB, parRapply,
parSapply, parSapplyLB
The following objects are masked from ‘package:stats’:
IQR, mad, sd, var, xtabs
The following objects are masked from ‘package:base’:
anyDuplicated, append, as.data.frame, basename, cbind, colnames,
dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget, order,
paste, pmax, pmax.int, pmin, pmin.int, Position, rank, rbind, Reduce,
rownames, sapply, setdiff, sort, table, tapply, union, unique, unsplit,
which.max, which.min
Attaching package: ‘S4Vectors’
The following object is masked from ‘package:clusterProfiler’:
rename
The following objects are masked from ‘package:base’:
expand.grid, I, unname
Loading required package: IRanges
Attaching package: ‘IRanges’
The following object is masked from ‘package:clusterProfiler’:
slice
The following object is masked from ‘package:grDevices’:
windows
Loading required package: GenomicRanges
Loading required package: GenomeInfoDb
Loading required package: SummarizedExperiment
Loading required package: MatrixGenerics
Attaching package: ‘MatrixGenerics’
The following objects are masked from ‘package:matrixStats’:
colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse, colCounts,
colCummaxs, colCummins, colCumprods, colCumsums, colDiffs, colIQRDiffs,
colIQRs, colLogSumExps, colMadDiffs, colMads, colMaxs, colMeans2,
colMedians, colMins, colOrderStats, colProds, colQuantiles, colRanges,
colRanks, colSdDiffs, colSds, colSums2, colTabulates, colVarDiffs,
colVars, colWeightedMads, colWeightedMeans, colWeightedMedians,
colWeightedSds, colWeightedVars, rowAlls, rowAnyNAs, rowAnys,
rowAvgsPerColSet, rowCollapse, rowCounts, rowCummaxs, rowCummins,
rowCumprods, rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks, rowSdDiffs,
rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars, rowWeightedMads,
rowWeightedMeans, rowWeightedMedians, rowWeightedSds, rowWeightedVars
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material; view with 'browseVignettes()'.
To cite Bioconductor, see 'citation("Biobase")', and for packages
'citation("pkgname")'.
Attaching package: ‘Biobase’
The following object is masked from ‘package:MatrixGenerics’:
rowMedians
The following objects are masked from ‘package:matrixStats’:
anyMissing, rowMedians
library(DOSE)
DOSE v3.18.2 For help: https://guangchuangyu.github.io/software/DOSE
If you use DOSE in published research, please cite:
Guangchuang Yu, Li-Gen Wang, Guang-Rong Yan, Qing-Yu He. DOSE: an R/Bioconductor package for Disease Ontology Semantic and Enrichment analysis. Bioinformatics 2015, 31(4):608-609
library('org.Hs.eg.db')
Loading required package: AnnotationDbi
Attaching package: ‘AnnotationDbi’
The following object is masked from ‘package:clusterProfiler’:
select
library(data.table)
data.table 1.14.0 using 6 threads (see ?getDTthreads). Latest news: r-datatable.com
Attaching package: ‘data.table’
The following object is masked from ‘package:SummarizedExperiment’:
shift
The following object is masked from ‘package:GenomicRanges’:
shift
The following object is masked from ‘package:IRanges’:
shift
The following objects are masked from ‘package:S4Vectors’:
first, second
###### Load the data into R.######
matrix_of_data <- as.matrix(read.table("GSE119290_Readhead_2018_RNAseq_gene_counts.txt"))
coldata<-read.csv("coldata.csv",header = T,row.names=1,stringsAsFactors=T)
##### calculate per-gene expression ranges and generating a density plot #####
matrix_of_ranges <- rowRanges(matrix_of_data, rows = NULL, cols = NULL, na.rm = FALSE, dim. = dim(matrix_of_data), useNames = NA)
vector_of_ranges <- 0
for(i in 1:26364) {
vector_of_ranges[i] <- matrix_of_ranges[i,2]-matrix_of_ranges[i,1]
}
vector_of_ranges
[1] 41 1767 132 132 132 132 5 5 5 5 3
[12] 3 1 1159 3270 1598 1598 22 22 22 6 6795
[23] 780 39 124 1111 28 102 2021 7450 888 32 15
[34] 1169 1133 98732 4 1028 41 1 1 1 14 77
[45] 6 10563 4508 43 3752 167 4599 8 1832 4976 51
[56] 1677 99 4426 40 1061 7033 5415 1625 5290 129 15
[67] 11 1244 367 3203 5804 274 5384 1102 3048 307 597
[78] 4254 10132 2897 7032 3740 49505 12 32 28 10 1705
[89] 3949 13540 706 209 6716 3721 53 2980 2161 28 117
[100] 13 2897 634 10 2 207 2181 1 534 969 3
[111] 923 833 525 3940 10 53 3687 1822 561 1697 25
[122] 2 0 139 0 3 1687 359 119 17325 23 188
[133] 8122 21 388 3736 7508 374 378 1 1179 3495 3052
[144] 7 1133 3149 3152 2788 3824 2 2872 3570 599 14
[155] 35 13253 21306 2 900 24204 20 157014 124 22 2
[166] 2 10 50 8 6421 500 231 910 4853 1140 1
[177] 72 21989 2971 1317 617 4 25 4932 39241 95 19906
[188] 902 897 210 2739 1447 878 15 14160 275 8176 5336
[199] 8058 176 35 1376 2751 2 222 1121 268 7510 1895
[210] 1698 357 2 1425 5122 1131 366 794 2488 10403 7352
[221] 2488 17 13 1 1447 1 3904 28 28 79 2
[232] 1 1 7 4 3 2 8 8 8 3 2
[243] 3 3 2 1 3 1 1 1 1 2 7
[254] 7 6 2 2 3 5 5 0 1 2 2
[265] 5 11017 4457 3006 799 1200 5 81 2735 9 3
[276] 40 532 1530 78 775 641 5069 185 3 6697 211
[287] 582 12264 9 3009 6 2 5 18 32 21303 539
[298] 565 2331 11170 9 2800 1331 1 0 7928 2279 69
[309] 2 2 5 68 3412 2901 3994 2805 183 12 9
[320] 0 5 3 18070 4617 567 1 26 740 1813 2
[331] 3029 4 5 4281 17509 8807 8857 95 151 707 2586
[342] 1738 13335 5025 2535 49 1127 3 84 0 2144 1
[353] 11 3 6 3 2 31 238 39 8 5195 2223
[364] 15 3 3255 72 3644 10838 1015 2591 15594 9933 5
[375] 17752 18 536 1613 1066 4797 2635 16293 1 1 40
[386] 570 14671 3622 0 3041 17 0 1 1 1 15623
[397] 1 518 5 6 12259 10 6 4128 129 331 23559
[408] 8387 1018 63 4543 2741 100 22553 3 29344 4179 935
[419] 3462 4903 1563 1042 504 4 0 7498 12001 4 6
[430] 9 4 58 342 815 1 90 26 8 8991 21268
[441] 16 0 0 1639 1339 59 4163 79 1386 724 5571
[452] 21117 2896 1106 843 1141 41159 2450 777 97 3 3
[463] 1144 65 1962 187 2 4 2725 7751 1651 3 86
[474] 4 4013 1490 41 38661 2325 2 18825 450 2091 64
[485] 1570 851 3 10816 14 1205 1767 1479 4063 2372 999
[496] 60 1435 24 6 332 8754 3491 2 916 451 3444
[507] 3776 9 155 3773 15 255 2095 25 6935 4965 4266
[518] 1116 6348 4278 8453 61 829 887 45 190 45 58
[529] 61 727 13 1005 5527 58 12478 1535 8136 1081 4099
[540] 2 2 17 13 38 0 58965 8804 48 48 8
[551] 4623 4205 2561 796 4389 1 4 59 8 4827 268
[562] 27561 9 1472 9841 13416 1426 9 7058 5237 1884 435
[573] 119 445 11258 212 29 4 6424 58752 133 214 2859
[584] 406 1225 1072 20138 4261 1299 7011 2394 1175 416 3532
[595] 1964 3419 223 890 4260 18 12015 657 21 336 795
[606] 2 3 40 0 22 8 42 19 2642 356 76
[617] 375 863 1679 34128 7909 5890 4032 92 13020 1311 6221
[628] 2357 10820 1650 531 1944 27 2058 11474 12352 206 3390
[639] 9894 2738 775 2917 3 1165 0 60 169 4 2908
[650] 28 1154 3216 3662 42 1609 5663 42 3752 1483 5248
[661] 1703 1228 5631 1294 2270 1357 6 17 1455 1736 1681
[672] 58 30 8027 17851 51909 10255 796 307 5994 1449 22
[683] 263 6 1516 3173 215 36 1138 1712 3398 26685 6738
[694] 2568 3 3967 887 2473 480 611 620 300 3409 666
[705] 2968 2 1 95 207 7894 491 138 2957 1590 38
[716] 6122 1 1 3461 878 13 705 167 2014 61381 228
[727] 2941 467 53 1262 1054 461 15 27625 87 127 6476
[738] 20 183 5 9 783 12 13014 4541 16 1813 3820
[749] 4 838 35040 5807 630 1535 12 4 115 2705 2215
[760] 5474 6074 413 24280 27 2039 4712 80 9712 7 872
[771] 42 8123 41389 59 42 40 58 16 1217 76 501
[782] 771 2002 2805 2923 1185 73 130 1854 1299 114 214
[793] 674 23224 5373 18910 93 4042 36 1326 1226 5819 8565
[804] 6 23 6757 363 2342 6000 8649 1486 251 2 1
[815] 13 133 3 687 363 2050 7 230 6918 3 0
[826] 2 26 2 2 1 17 783 3027 5010 20 6
[837] 29 13 128 7 246 2595 18 751 1 0 850
[848] 69 2554 2882 0 0 0 4508 35 1 3802 4418
[859] 6728 47 498 3875 711 15 4362 2074 1724 2482 4247
[870] 4964 311 18 1248 4877 206 160 3938 4 5 2
[881] 752 34 1137 2592 2660 634 14466 19 1 1 10
[892] 5272 1454 27 1644 2090 1 4392 843 1 7 629
[903] 4351 10736 275 60 22 2 1199 1246 486 4 8
[914] 38718 3 1 1576 7683 76 0 11118 1959 22 2
[925] 6 2343 5 291 187 2998 24724 6 25 7 1212
[936] 2 12 291 25 1 101 14 6835 7 3 607
[947] 1816 1 1 11 105 7893 8324 24 923 3 99
[958] 83 0 602 1349 750 82 3090 1334 9 1 8227
[969] 2 2002 3823 11 11 1 5001 1142 4180 851 8345
[980] 8 194 1 36 7 419 2152 1692 11 2 23
[991] 29496 3526 15995 355 4054 37225 7 2697 2902 38
[ reached getOption("max.print") -- omitted 25364 entries ]
plot(density(vector_of_ranges), log='x')
Warning in xy.coords(x, y, xlabel, ylabel, log) :
1 x value <= 0 omitted from logarithmic plot
PCA
##### generate a PCA plot #######
dds <- DESeqDataSetFromMatrix(countData = matrix_of_data,colData = coldata,design = ~ dex)
dds <- dds[rowSums(counts(dds)) > 1,]
vst <- vst(dds,blind = FALSE)
plotPCA(vst, intgroup=c("dex"))
UMAP
matrix_of_data_t <- t(matrix_of_data)
df.umap = umap(matrix_of_data_t)
dimension1 <- range(df.umap$layout)
dimension2 <- dimension1
plot(dimension1, dimension2, type="n")
points(df.umap$layout[,1], df.umap$layout[,2], col=as.integer(coldata[,"dex"])+2, cex=2, pch=20)
labels.u = unique(coldata[,"dex"])
legend.text = as.character(labels.u)
legend.pos = "bottomleft"
legend.text = paste(as.character(labels.u), "")
legend(legend.pos, legend=legend.text, inset=0.03,col=as.integer(labels.u)+2,bty="n", pch=20, cex=1)
Summary for PCA/UMAP
#### Perform differential analysis on the samples from your two groups #####
deseq_object <- DESeq(dds)
estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing
-- replacing outliers and refitting for 308 genes
-- DESeq argument 'minReplicatesForReplace' = 7
-- original counts are preserved in counts(dds)
estimating dispersions
fitting model and testing
deseq_results <- results(deseq_object)
deseq_results <- lfcShrink(deseq_object,coef = 2, res = deseq_results )
using 'apeglm' for LFC shrinkage. If used in published research, please cite:
Zhu, A., Ibrahim, J.G., Love, M.I. (2018) Heavy-tailed prior distributions for
sequence count data: removing the noise and preserving large differences.
Bioinformatics. https://doi.org/10.1093/bioinformatics/bty895
head(deseq_results)
log2 fold change (MAP): dex SZ vs Control
Wald test p-value: dex SZ vs Control
DataFrame with 6 rows and 5 columns
baseMean log2FoldChange lfcSE pvalue padj
<numeric> <numeric> <numeric> <numeric> <numeric>
DDX11L1 15.2075 0.0185428 0.144337 0.785012 0.891035
WASH7P 1157.3822 0.1255759 0.110817 0.140908 0.348768
MIR6859-3 84.1270 0.1824412 0.131918 0.057109 0.200511
MIR6859-2 84.1270 0.1824412 0.131918 0.057109 0.200511
MIR6859-4 84.1270 0.1824412 0.131918 0.057109 0.200511
MIR6859-1 84.1270 0.1824412 0.131918 0.057109 0.200511
deseq_df <- deseq_results %>%
# make into data.frame
as.data.frame() %>%
# the gene names are row names -- let's make them a column for easy display
tibble::rownames_to_column("Gene") %>%
# add a column for significance threshold results
dplyr::mutate(threshold = padj < 0.05) %>%
# sort by statistic -- the highest values will be genes with
# higher expression in RPL10 mutated samples
dplyr::arrange(dplyr::desc(log2FoldChange))
head(deseq_df)
plotCounts(dds, gene = "DDX11L1", intgroup = "dex")
Volcano plot
#volcano plot here
volcano_plot <- EnhancedVolcano::EnhancedVolcano(
deseq_df,lab = deseq_df$Gene,x = "log2FoldChange",
y = "padj",pCutoff = 0.01 )
Warning: Ignoring unknown parameters: xlim, ylim
volcano_plot
Summary of volcano plot
Heatmap
#heatmap
vst <- vst(dds,blind = FALSE)
sampleDists <- dist(t(assay(vst)))
sampleDists
Sample_LI-01 Sample_LI-02 Sample_LI-03 Sample_LI-04 Sample_LI-05
Sample_LI-02 144.42095
Sample_LI-03 96.21431 133.97114
Sample_LI-04 99.87450 139.71166 37.77497
Sample_LI-05 83.60813 156.00739 101.78338 103.29480
Sample_LI-06 119.44816 116.22757 120.67800 123.34283 96.11964
Sample_LI-07 95.37821 155.36706 112.59969 113.70332 76.26127
Sample_LI-08 95.26113 141.14019 109.07132 108.09854 78.42244
Sample_LI-09 80.44241 130.44555 99.27533 104.59342 58.80801
Sample_LI-10 84.61049 159.95587 101.84089 97.85090 50.30949
Sample_LI-11 89.07397 124.77483 80.79587 83.05840 91.87756
Sample_LI-12 42.10256 153.45828 95.05450 93.44808 80.13373
Sample_LI-13 93.57608 146.53943 89.62124 92.59267 71.68491
Sample_LI-14 98.06298 144.25071 93.04808 91.18505 75.18324
Sample_LI-15 88.27419 145.39448 96.89075 99.89449 72.05981
Sample_LI-16 91.47799 134.87317 95.56142 94.36748 76.91570
Sample_LI-17 110.21659 156.31258 108.56673 112.58090 93.57011
Sample_LI-18 115.32256 165.33256 111.34182 106.54065 96.07806
Sample_LI-19 117.31859 167.04564 123.81656 126.89681 116.25921
Sample_LI-20 117.22968 160.07730 119.69261 118.73346 114.31221
Sample_LI-23 115.82222 159.04777 117.81344 119.68953 106.69378
Sample_LI-24 117.43344 161.46974 117.82703 114.48324 106.97566
Sample_LI-25 114.69350 134.34106 78.37575 79.54452 105.88925
Sample_LI-27 107.69827 150.74338 98.20624 102.74555 101.89783
Sample_LI-06 Sample_LI-07 Sample_LI-08 Sample_LI-09 Sample_LI-10
Sample_LI-02
Sample_LI-03
Sample_LI-04
Sample_LI-05
Sample_LI-06
Sample_LI-07 115.94728
Sample_LI-08 101.09101 39.36980
Sample_LI-09 75.32779 84.06512 78.70855
Sample_LI-10 101.46811 77.90681 75.79231 61.25290
Sample_LI-11 124.53992 90.01049 90.30398 96.56694 92.11518
Sample_LI-12 119.25132 91.88427 91.16393 84.22445 73.28430
Sample_LI-13 108.62476 78.17900 80.65945 81.27663 78.69382
Sample_LI-14 100.26206 80.47143 76.53043 83.31679 75.45347
Sample_LI-15 108.87130 68.31503 72.22288 79.79228 80.26742
Sample_LI-16 96.16823 75.87451 68.34821 79.38293 78.20575
Sample_LI-17 121.32108 79.21296 81.19500 97.19180 98.54813
Sample_LI-18 125.66399 82.89720 79.18897 107.02390 90.33651
Sample_LI-19 147.07707 92.33292 97.36230 118.90778 119.27992
Sample_LI-20 136.86663 91.68664 89.17978 115.67221 112.69894
Sample_LI-23 132.85037 86.20527 87.29910 108.25278 108.02152
Sample_LI-24 133.33938 86.94077 83.42641 112.45640 102.49730
Sample_LI-25 117.89541 115.89225 112.34720 106.37446 107.46017
Sample_LI-27 130.49242 101.04756 103.66152 101.12589 103.96036
Sample_LI-11 Sample_LI-12 Sample_LI-13 Sample_LI-14 Sample_LI-15
Sample_LI-02
Sample_LI-03
Sample_LI-04
Sample_LI-05
Sample_LI-06
Sample_LI-07
Sample_LI-08
Sample_LI-09
Sample_LI-10
Sample_LI-11
Sample_LI-12 86.55308
Sample_LI-13 80.22686 89.71738
Sample_LI-14 85.40616 90.37432 37.77026
Sample_LI-15 82.77390 86.44219 39.10570 46.18250
Sample_LI-16 85.62742 86.45278 48.33529 39.87597 40.29908
Sample_LI-17 102.53927 108.27463 81.33364 84.02504 81.23823
Sample_LI-18 103.42611 104.93807 87.07794 80.62000 87.91855
Sample_LI-19 114.69949 117.15625 106.04336 110.89265 101.00510
Sample_LI-20 112.69567 113.06653 104.30384 102.93066 99.72611
Sample_LI-23 108.10184 113.69641 98.01200 99.75891 93.77137
Sample_LI-24 107.78078 110.80109 100.49780 96.56208 95.63683
Sample_LI-25 83.09330 111.47755 81.20942 83.53271 92.85541
Sample_LI-27 95.66504 108.89038 90.49695 96.61505 95.68051
Sample_LI-16 Sample_LI-17 Sample_LI-18 Sample_LI-19 Sample_LI-20
Sample_LI-02
Sample_LI-03
Sample_LI-04
Sample_LI-05
Sample_LI-06
Sample_LI-07
Sample_LI-08
Sample_LI-09
Sample_LI-10
Sample_LI-11
Sample_LI-12
Sample_LI-13
Sample_LI-14
Sample_LI-15
Sample_LI-16
Sample_LI-17 85.23606
Sample_LI-18 82.64446 52.15115
Sample_LI-19 106.81797 76.73758 90.05378
Sample_LI-20 98.16905 76.69849 76.56972 43.88827
Sample_LI-23 96.33758 61.38807 72.16698 58.01886 58.54225
Sample_LI-24 92.77982 71.02220 61.30604 70.22445 56.41068
Sample_LI-25 90.58481 116.67682 116.47200 136.97377 131.50994
Sample_LI-27 100.95401 104.69435 111.40654 113.49061 114.35801
Sample_LI-23 Sample_LI-24 Sample_LI-25 Sample_LI-27 Sample_LI-28
Sample_LI-02
Sample_LI-03
Sample_LI-04
Sample_LI-05
Sample_LI-06
Sample_LI-07
Sample_LI-08
Sample_LI-09
Sample_LI-10
Sample_LI-11
Sample_LI-12
Sample_LI-13
Sample_LI-14
Sample_LI-15
Sample_LI-16
Sample_LI-17
Sample_LI-18
Sample_LI-19
Sample_LI-20
Sample_LI-23
Sample_LI-24 42.82230
Sample_LI-25 126.77162 125.40196
Sample_LI-27 111.71844 114.67548 101.83543
Sample_LI-29 Sample_LI-30 Sample_LI-31 Sample_LI-32 Sample_LI-33
Sample_LI-02
Sample_LI-03
Sample_LI-04
Sample_LI-05
Sample_LI-06
Sample_LI-07
Sample_LI-08
Sample_LI-09
Sample_LI-10
Sample_LI-11
Sample_LI-12
Sample_LI-13
Sample_LI-14
Sample_LI-15
Sample_LI-16
Sample_LI-17
Sample_LI-18
Sample_LI-19
Sample_LI-20
Sample_LI-23
Sample_LI-24
Sample_LI-25
Sample_LI-27
Sample_LI-34 Sample_LI-36 Sample_LI-37 Sample_LI-38 Sample_LI-39
Sample_LI-02
Sample_LI-03
Sample_LI-04
Sample_LI-05
Sample_LI-06
Sample_LI-07
Sample_LI-08
Sample_LI-09
Sample_LI-10
Sample_LI-11
Sample_LI-12
Sample_LI-13
Sample_LI-14
Sample_LI-15
Sample_LI-16
Sample_LI-17
Sample_LI-18
Sample_LI-19
Sample_LI-20
Sample_LI-23
Sample_LI-24
Sample_LI-25
Sample_LI-27
Sample_LI-40 Sample_LI-41 Sample_LI-42 Sample_LI-43 Sample_LI-44
Sample_LI-02
Sample_LI-03
Sample_LI-04
Sample_LI-05
Sample_LI-06
Sample_LI-07
Sample_LI-08
Sample_LI-09
Sample_LI-10
Sample_LI-11
Sample_LI-12
Sample_LI-13
Sample_LI-14
Sample_LI-15
Sample_LI-16
Sample_LI-17
Sample_LI-18
Sample_LI-19
Sample_LI-20
Sample_LI-23
Sample_LI-24
Sample_LI-25
Sample_LI-27
Sample_LI-45 Sample_LI-46 Sample_LI-47
Sample_LI-02
Sample_LI-03
Sample_LI-04
Sample_LI-05
Sample_LI-06
Sample_LI-07
Sample_LI-08
Sample_LI-09
Sample_LI-10
Sample_LI-11
Sample_LI-12
Sample_LI-13
Sample_LI-14
Sample_LI-15
Sample_LI-16
Sample_LI-17
Sample_LI-18
Sample_LI-19
Sample_LI-20
Sample_LI-23
Sample_LI-24
Sample_LI-25
Sample_LI-27
[ reached getOption("max.print") -- omitted 20 rows ]
sampleDistMatrix <- as.matrix( sampleDists )
rownames(sampleDistMatrix) <- paste( vst$dex, sep = " - " )
colnames(sampleDistMatrix) <- NULL
colors <- colorRampPalette( rev(brewer.pal(9, "Blues")) )(255)
pheatmap(sampleDistMatrix,clustering_distance_rows = sampleDists,
clustering_distance_cols = sampleDists,col = colors)
topVarGenes <- head(order(rowVars(assay(vst)), decreasing = TRUE), 20)
topVarGenes
[1] 23946 23945 24456 15776 24464 1660 10189 24311 1652 24481 6057 24463 22385
[14] 15849 24483 10106 10188 20848 10186 21054
mat <- assay(vst)[topVarGenes, ]
mat<-mat-rowMeans(mat)
anno <- as.data.frame(colData(vst)[c("dex")])
pheatmap(mat, annotation_col = anno)
Method/Ontology 1
#Method 1 code
topVarGenesGO <- head(order(rowVars(assay(vst)), decreasing = TRUE), 100)
topVarGenesGO
[1] 23946 23945 24456 15776 24464 1660 10189 24311 1652 24481 6057 24463 22385
[14] 15849 24483 10106 10188 20848 10186 21054 17462 16399 24462 22 24480 15850
[27] 24129 2580 20649 24457 20650 9553 24465 12523 4972 4259 10185 22806 10187
[40] 10184 15491 3454 20648 9739 21229 18715 17622 21773 3455 22294 7321 10181
[53] 10908 20645 21227 23012 20653 5159 4477 5001 3052 15389 23982 13827 21651
[66] 3453 20454 3720 4449 17605 19980 11498 17055 1707 17248 6386 6061 1509
[79] 20557 23515 23131 3711 3755 19040 22399 6996 5606 2536 18929 11051 6170
[92] 1511 10518 18091 15206 2875 20651 14865 7057 2894
mat_100 <- assay(vst)[topVarGenesGO, ]
gostres <- gost(query = rownames(mat_100),
organism = "hsapiens", ordered_query = FALSE,
multi_query = FALSE, significant = TRUE, exclude_iea = FALSE,
measure_underrepresentation = FALSE, evcodes = FALSE,
user_threshold = 0.05, correction_method = "g_SCS",
domain_scope = "annotated", custom_bg = NULL,
numeric_ns = "", sources = NULL, as_short_link = FALSE)
names(gostres)
[1] "result" "meta"
head(gostres$result, 6)
names(gostres$meta)
[1] "query_metadata" "result_metadata" "genes_metadata" "timestamp"
[5] "version"
gostplot(gostres, capped = TRUE, interactive = TRUE)
plot <- gostplot(gostres, capped = FALSE, interactive = FALSE)
plot
publish_gosttable(gostres, highlight_terms = gostres$result[c(1:2,10,120),],
use_colors = TRUE,
show_columns = c("source", "term_name", "term_size", "intersection_size"),
filename = NULL)
The input 'highlight_terms' is a data.frame. The column 'term_id' will be used.
summary of Method/Ontology 1
Method/Ontology 2
#Method 2 code
#get differentially expressed genes
dds <- DESeqDataSetFromMatrix(countData = matrix_of_data,colData = coldata,design = ~ dex)
res <- results(DESeq(dds))
estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing
-- replacing outliers and refitting for 308 genes
-- DESeq argument 'minReplicatesForReplace' = 7
-- original counts are preserved in counts(dds)
estimating dispersions
fitting model and testing
#store values
geneList = res[,2]
#create list of genes
names_list = rownames(res)
#lead gene names into symbols variable
symbols <- rownames(res)
#map symbols to IDs
gene_ids <- mapIds(org.Hs.eg.db, symbols, 'ENTREZID', 'SYMBOL')
'select()' returned 1:many mapping between keys and columns
#update ID's in list of gene names
for(i in 1:26364){
names_list[i] = gene_ids[names_list[i]]
}
names(geneList) = names_list
geneList = sort(geneList, decreasing = TRUE)
gene <- names(geneList)[abs(geneList) > 1.5]
x <- enrichDO(gene = gene,
ont = "DO",
pvalueCutoff = 0.05,
pAdjustMethod = "BH",
universe = names(geneList),
minGSSize = 5,
maxGSSize = 500,
qvalueCutoff = 0.05,
readable = TRUE)
output_table <- x@result
setDT(output_table)
output_table
y <- gseDO(geneList,
minGSSize = 120,
pvalueCutoff = 0.2,
pAdjustMethod = "BH",
verbose = FALSE)
Warning in preparePathwaysAndStats(pathways, stats, minSize, maxSize, gseaParam, :
There are ties in the preranked stats (6.27% of the list).
The order of those tied genes will be arbitrary, which may produce unexpected results.
Warning in preparePathwaysAndStats(pathways, stats, minSize, maxSize, gseaParam, :
There are duplicate gene names, fgsea may produce unexpected results.
Warning in serialize(data, node$con) :
'package:stats' may not be available when loading
Warning in serialize(data, node$con) :
'package:stats' may not be available when loading
Warning in serialize(data, node$con) :
'package:stats' may not be available when loading
Warning in serialize(data, node$con) :
'package:stats' may not be available when loading
Warning in serialize(data, node$con) :
'package:stats' may not be available when loading
Warning in serialize(data, node$con) :
'package:stats' may not be available when loading
Warning in serialize(data, node$con) :
'package:stats' may not be available when loading
Warning in serialize(data, node$con) :
'package:stats' may not be available when loading
Warning in serialize(data, node$con) :
'package:stats' may not be available when loading
Warning in serialize(data, node$con) :
'package:stats' may not be available when loading
output_table <- y@result
setDT(output_table)
output_table
summary of Method/Ontology 2
Method/Ontology 3
summary of Method/Ontology 3